Creating Interactive Graphs for the Web
Usefulness in a data science workflow
Exploratory work:
Expository work:
Shareable, portable, composable (e.g., reports, dashboards)
Grammar of graphics
- Leland Wilkinson (1999): A grammar of graphics is a framework that follows a layered approach to describing and constructing visualizations in a structured manner.
Many libraries abide by this grammar: ggplot2, plotly, D3 and many others.
Grammar of graphics and ggplot2
The gg in ggplot2 stands for Grammar of Graphics.
“The transferrable skills from ggplot2 are not the idiosyncracies of plotting syntax, but a powerful way of thinking about visualisation, as a way of mapping between variables and the visual properties of geometric objects that you can perceive.”
Advantages
- Functional data visualization
- Clean and wrangle data
- Map data to visual elements
- Tweak scales, guides, axis, labels and theme
- Ease of iteration
- Ease in maintaining consistency
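The wrangle, map, and tweak steps listed above can be sketched in a few lines. This is only an illustration: it assumes the built-in mtcars data rather than the NFL data used later, and the name cars is made up for the example.

```r
library(ggplot2)

# Step 1: clean and wrangle (here: turn a numeric code into a factor)
cars <- transform(mtcars, cyl = factor(cyl))

# Step 2: map data to visual elements (aesthetics plus a geometry)
p <- ggplot(cars, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point()

# Step 3: tweak scales, guides, labels and theme
p <- p +
  scale_colour_brewer(palette = "Dark2", name = "Cylinders") +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon") +
  theme_minimal()

p
```

Because each step is a separate, composable piece, changing the theme or swapping the geometry does not require touching the data preparation, which is what makes iteration and consistency easy.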
Prepare your system
Update R and RStudio to the latest released version: https://www.rstudio.com/products/rstudio/download/
Consider working in the preview version of RStudio as well for the latest features (it is more stable than the daily build): https://www.rstudio.com/products/rstudio/download/preview/
Install the following packages:
install.packages(c("devtools", "knitr",
                   "usethis", "tidyverse",
                   "plotly", "r2d3",
                   "crosstalk", "shiny",
                   "flexdashboard"))
Data source: espnscrapeR, which scrapes QBR, NFL standings, and stats from ESPN
library(tidyverse)
library(espnscrapeR)
# Build every season/week combination, then fetch and row-bind the QBR data for each
nfl_qbr <-
  crossing(season = 2020, week = 1:6) %>%
  pmap_dfr(espnscrapeR::get_nfl_qbr)
Plotly
The plotly R package lets you create a variety of interactive graphics
Two ways:
- transforming a ggplot2 object into a plotly object via ggplotly()
- directly initializing a plotly object with plot_ly(), plot_geo() or plot_mapbox()
Each approach has strengths and weaknesses
Learning both will pay dividends, as they share a common grammar and what you learn in one carries over to the other
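As a rough side-by-side sketch of the two routes, the snippet below draws the same scatterplot both ways. It assumes the built-in mtcars data, and the object names (gg, p_from_gg, p_direct) are made up for illustration.

```r
library(plotly)  # also attaches ggplot2

# Route 1: build a ggplot2 object, then convert it
gg <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
p_from_gg <- ggplotly(gg)

# Route 2: initialize a plotly object directly
p_direct <- plot_ly(mtcars, x = ~wt, y = ~mpg) %>%
  add_markers()

# Both routes end in the same kind of object
class(p_from_gg)
class(p_direct)
```

Since both results are plotly objects, functions such as layout() can be applied to either one afterwards.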
Intro to ggplotly()
library(plotly)
nfl_qbr_plot <- nfl_qbr %>%
  ggplot(aes(x = total_epa, y = qbr_total, label = short_name)) +
  geom_point() +
  geom_smooth(method = "lm") +
  geom_label() +
  theme_minimal() +
  labs(
    x = "EPA", y = "QBR",
    title = "EPA is correlated with QBR"
  )
# Print the static ggplot2 version
nfl_qbr_plot
# Convert the ggplot2 object into an interactive plotly object
ggplotly(nfl_qbr_plot)
Intro to plot_ly()
Any plot made with plot_ly() uses the JavaScript library plotly.js
plot_ly() interfaces directly with plotly.js
plot_ly() has arguments that fit into the “Grammar of Graphics”: x, y, color, stroke, span, symbol, linetype
There is a family of add_*() functions:
- add_lines()
- add_bars()
- add_histogram2d()
- add_contour()
- add_boxplot()
- …and many more!
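To illustrate how add_*() layers stack, here is a small sketch that puts a fitted line on top of a scatterplot. It assumes the built-in mtcars data; the names fit, cars and fitted_mpg are invented for the example.

```r
library(plotly)

# Fit a simple linear model so there is something to draw as a line
fit <- lm(mpg ~ wt, data = mtcars)
cars <- mtcars[order(mtcars$wt), ]
cars$fitted_mpg <- predict(fit, newdata = cars)

# x is mapped globally; each add_*() layer supplies its own y
p <- plot_ly(cars, x = ~wt) %>%
  add_markers(y = ~mpg, name = "observed") %>%
  add_lines(y = ~fitted_mpg, name = "linear fit")

p
```

Each add_*() call contributes one layer, so the order of the calls is the drawing order.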
The plotly package takes a purely functional approach to a layered grammar of graphics: (almost) every function expects a plotly object as its first argument and returns a modified version of that object.
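A minimal sketch of that "plotly in, plotly out" pattern, assuming the built-in mtcars data (the names p0, p1, p2, p_nested and p_piped are made up for the example):

```r
library(plotly)

p0 <- plot_ly(mtcars, x = ~wt, y = ~mpg)        # a plotly object
p1 <- add_markers(p0)                           # plotly in, plotly out
p2 <- layout(p1, title = "Weight vs. mileage")  # plotly in, plotly out

# Because every step returns a plotly object, nesting and piping
# are two spellings of the same pipeline
p_nested <- layout(
  add_markers(plot_ly(mtcars, x = ~wt, y = ~mpg)),
  title = "Weight vs. mileage"
)
p_piped <- plot_ly(mtcars, x = ~wt, y = ~mpg) %>%
  add_markers() %>%
  layout(title = "Weight vs. mileage")

# Check that each intermediate result is still a plotly object
sapply(list(p0, p1, p2), inherits, what = "plotly")
```

Keeping the intermediate objects around (p0, p1) is optional; it simply makes it visible that each function returns a new object rather than drawing as a side effect.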
Example: the layout() function expects a plotly object as its first argument; its other arguments add or modify layout components of that object (e.g., the title):
layout(
  plot_ly(nfl_qbr %>% filter(game_week == 3), x = ~short_name, y = ~total_epa, color = ~rank),
  title = "QB Total EPA for Week 3"
)
Notice that data manipulation verbs from the dplyr package (such as filter()) can be used to transform the data underlying a plotly object!
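A dplyr verb can also be applied after plot_ly(), directly to the plotly object. The sketch below assumes the built-in mtcars data instead of the NFL data, and uses plotly_data() to inspect the data currently attached to the object.

```r
library(dplyr)
library(plotly)

p <- mtcars %>%
  plot_ly(x = ~wt, y = ~mpg) %>%
  filter(cyl == 4) %>%   # dplyr verb applied to the plotly object itself
  add_markers()

# Inspect the data that the markers layer will be drawn from
plotly_data(p)
```

Layers added after the verb use the transformed data, so the position of filter() in the pipeline decides which layers see the filtered rows.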
The week 3 plot can also be written more cleanly with the pipe:
nfl_qbr %>%
  filter(game_week == 3) %>%
  plot_ly(x = ~short_name, y = ~total_epa, color = ~rank) %>%
  add_bars()
Add a layer of text on top of the bars. The text layer inherits the global x, y and color mappings and adds a mapping of its own (text):
nfl_qbr %>%
  filter(game_week == 3) %>%
  plot_ly(x = ~short_name, y = ~total_epa, color = ~rank) %>%
  add_bars() %>%
  add_text(
    text = ~team,
    textposition = "top middle",
    cliponaxis = FALSE
  )
To recap:
- Globally assign short_name, total_epa and rank to x, y and color, respectively
- Add a bars layer (which inherits x and y from plot_ly())
- Use dplyr verbs to modify the data underlying the plotly object
- Add additional layers (like a layer of text)
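The recap can be tried without scraping ESPN by swapping in a small hand-made table. Everything in toy_qbr below is hypothetical data invented for illustration (not real QBR values); only the column names follow the examples above.

```r
library(dplyr)
library(plotly)

# Hypothetical toy data, made up for illustration only
toy_qbr <- tibble(
  game_week  = c(2, 2, 3, 3, 3),
  short_name = c("QB A", "QB B", "QB A", "QB B", "QB C"),
  team       = c("AAA", "BBB", "AAA", "BBB", "CCC"),
  total_epa  = c(4.1, 2.3, 6.0, 1.5, 3.2),
  rank       = c(2, 5, 1, 3, 2)
)

p <- toy_qbr %>%
  filter(game_week == 3) %>%                                 # dplyr verb
  plot_ly(x = ~short_name, y = ~total_epa, color = ~rank) %>% # global mappings
  add_bars() %>%                                             # bars layer
  add_text(                                                  # text layer
    text = ~team,
    textposition = "top middle",
    cliponaxis = FALSE
  ) %>%
  layout(title = "Toy example: total EPA for week 3")

p
```

Replacing toy_qbr with nfl_qbr should give the real plot, provided the scraped data has the same column names.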
Share
Lessons learned
- The number of interactive techniques is overwhelming
- Focus on identifying an analysis task or question first
- There will be different ways of doing the same thing; pick the one that makes the most sense to you
Ideally, according to Carson Sievert:
Go for R packages/technologies for creating interactive web graphics which:
- Don’t require knowledge of web technologies (start-up cost)
- Produce standalone HTML whenever possible (hosting/maintenance cost)
- Work well with other “tidy” tools in R (iteration cost)
- Link to external vis libraries (startover cost)
- Easy to use interactive techniques that support data analysis tasks (discovery cost)
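As one possible illustration of the "standalone HTML" point, a plotly object can be written to an HTML file with htmlwidgets::saveWidget(). This is a sketch under assumptions: the file name is made up, the plot uses the built-in mtcars data, and the has_pandoc check is a rough heuristic added here because a fully self-contained file requires pandoc.

```r
library(plotly)

p <- plot_ly(mtcars, x = ~wt, y = ~mpg) %>%
  add_markers()

# A single self-contained file needs pandoc; without it, fall back to
# an HTML file plus a folder of dependencies (rough availability check)
has_pandoc <- nzchar(Sys.which("pandoc")) || nzchar(Sys.getenv("RSTUDIO_PANDOC"))

# Hypothetical output location for the example
out_file <- file.path(tempdir(), "interactive-plot.html")
htmlwidgets::saveWidget(p, out_file, selfcontained = has_pandoc)

file.exists(out_file)
```

The resulting file opens in any browser without a running R session, which keeps hosting and maintenance costs low.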
References
Interactive web-based data visualization with R, plotly, and shiny by Carson Sievert: https://plotly-r.com/index.html
A Gentle Guide to the Grammar of Graphics with ggplot2 by Garrick Aden-Buie: https://pkg.garrickadenbuie.com/trug-ggplot2/#1
espnscrapeR package by Thomas Mock: https://jthomasmock.github.io/espnscrapeR/index.html
My Talk on Grammar of Graphics: The Secret Sauce of Powerful Data Stories by Ganes Kesari: https://medium.com/@kesari/my-talk-on-grammar-of-graphics-the-secret-sauce-of-powerful-data-stories-3da618cf1bbf